Linguistic Processing of Texts Using Geppetto

Authors

  • Fabio Ciravegna
  • Alberto Lavelli
  • Fabio Pianesi
Abstract

We describe the linguistic analyzer of a prototype for Information Extraction from texts. The analyzer uses information derived from a shallow processor to limit the computational cost of the analysis. At the same time, shallow techniques are used to collapse parse fragments when a complete parse is not possible. The linguistic analyzer has been built using GePpeTto, an environment that allows the development and integration of different linguistic resources and processors. GePpeTto includes graphical tools for editing and debugging linguistic data and a repertoire of parsers. In this paper, we sketch the architecture of the Information Extraction system, then show how a linguistic analyzer can be built using GePpeTto. Finally, we present the results of some experiments carried out on a corpus of short Italian economic news items.

Similar resources

Developing Language Resources and Applications with GEPPETTO

In the development of LE applications there are two crucial areas: architectural design and resource development. In this paper we describe the design phase, current status, and future directions of a development environment, called GEPPETTO, whose aim is to address the two issues within the same environment. The design of GEPPETTO has been carried out using a user-centered approach, to secure ...

DAFOE: a Platform for Building Ontologies from Texts

Although text-based ontology engineering gained much popularity in the last 10 years, very few ontology engineering platforms exploit the full potential of the connection between texts and ontologies. We propose DAFOE, a new platform for building ontologies with a terminological component using different types of linguistic entries (text corpora, results of natural language processing tools, te...

A Cross-linguistic and Cross-cultural Study of Epistemic Modality Markers in Linguistics Research Articles

Epistemic modality devices are believed to be one of the prominent characteristics of research articles, the genre most commonly used by members of the academic community. Considering the importance of such devices in producing and comprehending scientific discourse, this study aimed to investigate, cross-culturally and cross-linguistically, epistemic modality markers as an important subcategory...

A Comparative Study of Discourse Markers: The Case of three English Applied Linguistic Texts with their Farsi Translations

This research was an attempt to find the relationship between English discourse markers (DMs) and their Farsi translations. It was conducted to find out whether DM translations fully reflect the orientation of the source texts and to what extent they are functionally appropriate compared to the original text. Six instruments were used. Three of them were the original English books...

Using decision trees to select the grammatical relation of a noun phrase

We present a machine-learning approach to modeling the distribution of noun phrases (NPs) within clauses with respect to a fine-grained taxonomy of grammatical relations. We demonstrate that a cluster of superficial linguistic features can function as a proxy for more abstract discourse features that are not observable using state-of-the-art natural language processing. The models co...

Publication date: 1995